12/3,K/1 (Item 1 from file: 2)

DIALOG (R) File 2 : INSPEC 

(c) 2006 Institution of Electrical Engineers. All rts. reserv. 

06335780 INSPEC Abstract Number: B9609-6140C-553, C9609-5260B-339
Title: Gesture estimation using color combination 

Author(s): Yoshino, K. ; Yoshikawa, K. ; Kawashima, T. ; Aoki, Y. 
Author Affiliation: Dept. of Inf. Eng., Hokkaido Univ., Sapporo, Japan 
Conference Title: ACCV '95. Second Asian Conference on Computer Vision. 
Proceedings Part vol.2 p. 405-9 vol.2
Publisher: Nanyang Technol. Univ, Singapore

Publication Date: 1995 Country of Publication: Singapore 3 vol. 

(xxxii+548+811+839) pp.

ISBN: 981 00 7177 9 Material Identity Number: XX96-01801 

Conference Title: Proceedings of Second Asian Conference on Computer
Vision. ACCV '95

Conference Sponsor: Int. Assoc. Pattern Recognition; IEICE of Japan; Inf. 
Processing Soc. Japan; et al 

Conference Date: 5-8 Dec. 1995 Conference Location: Singapore 
Language: English 
Subfile: B C 
Copyright 1996, IEE 

Abstract: This paper introduces a method for recognizing a gesture by 
estimating the structure and motion of a human hand. The method uses a 
coloured glove to which multiple color patches... 

...estimated using the color combination of visible patches in the input
scene taken by a video camera. Patches are extracted by computing the 
ratio of the color histograms of the input image and the model image 
which concatenates all the color patches on the coloured glove. 
Additionally, hand motion is estimated from the changes in the color 
combination and the trajectory of the center of gravity of the finger 
patches. The experimental results evaluate the validity of the proposed 
method. 

...Descriptors: motion estimation 

...Identifiers: video camera; color histograms; model image 
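
Note: the patch-extraction step described in the abstract above (a ratio of the input-image and model-image color histograms, followed by a center of gravity of the extracted patches) could be sketched roughly as follows. This is an illustration in the style of histogram backprojection, not the authors' implementation: the clipping of the ratio to 1, the threshold value, and the function names ratio_histogram, patch_mask and centroid are all assumptions, and pixels are represented by already-quantized color labels.

```python
from collections import Counter


def ratio_histogram(model_pixels, image_pixels):
    """Per-color ratio min(M[c] / I[c], 1) of model to image histogram counts.

    The clipping to 1 is an assumption borrowed from histogram backprojection.
    """
    model_hist = Counter(model_pixels)
    image_hist = Counter(image_pixels)
    # Counter returns 0 for colors that never occur in the model image.
    return {c: min(model_hist[c] / n, 1.0) for c, n in image_hist.items()}


def patch_mask(image_rows, model_pixels, threshold=0.5):
    """Mark pixels whose color has a high model/image ratio (hypothetical threshold)."""
    flat = [c for row in image_rows for c in row]
    ratio = ratio_histogram(model_pixels, flat)
    return [[ratio[c] >= threshold for c in row] for row in image_rows]


def centroid(mask):
    """Center of gravity (x, y) of the marked pixels, or None if none are marked."""
    points = [(x, y) for y, row in enumerate(mask) for x, hit in enumerate(row) if hit]
    if not points:
        return None
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)
```

Tracking the value returned by centroid from frame to frame would give a trajectory of the kind the abstract mentions for the finger patches.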

12/3,K/2 (Item 2 from file: 2) 

DIALOG (R) File 2 : INSPEC 

(c) 2006 Institution of Electrical Engineers. All rts. reserv. 

06229134 INSPEC Abstract Number: B9605-6140C-248, C9605-5260B-168
Title: Direct estimation of human hand gesture using color combination 

Author(s): Yoshino, K. ; Kawashima, T.; Aoki, Y. 

Author Affiliation: Fac. of Eng., Hokkaido Univ., Sapporo, Japan 
Journal: Transactions of the Institute of Electronics, Information and 
Communication Engineers A vol.J79-A, no. 2 p. 424-31 
Publisher: Inst. Electron. Inf. & Commun. Eng,
Publication Date: Feb. 1996 Country of Publication: Japan 
CODEN: DJTAER ISSN: 0913-5707 
SICI: 0913-5707(199602)J79A:2L.424:DEHH;1-J
Material Identity Number: K838-96004 
Language: Japanese 
Subfile: B C 
Copyright 1996, IEE 

...Abstract: estimated using the color combination of visible patches in
the input scene taken by a video camera. Patches are extracted by
computing the ratio of both the color histograms of the input image
and the model image which concatenates all the color patches on the color
glove. Additionally, the hand motion is estimated from the change of
the color combination and the trajectory of the center of gravity of the
finger patches.

...Identifiers: video camera; color histograms; model image; hand 
motion; trajectory 

12/3,K/3 (Item 1 from file: 8)

DIALOG (R) File 8 : Ei Compendex(R) 

(c) 2006 Elsevier Eng. Info. Inc. All rts. reserv. 

05623482 E.I. No: EIP00085280132

Title: Fast motion compensation algorithm for video sequences with 
local brightness variations 

Author: Kim, Sang Hyun; Park, Rae-Hong 

Corporate Source: Sogang Univ, Seoul, South Korea

Conference Title: Visual Communications and Image Processing 2000 

Conference Location: Perth, Aust Conference Date: 20000620-20000623

Source: Proceedings of SPIE - The International Society for Optical
Engineering v 4067 (III) 2000. Society of Photo-Optical Instrumentation
Engineers, Bellingham, WA, USA. p 1229-1238
Publication Year: 2000 
CODEN: PSISDG ISSN: 0277-786X 
Language: English

Title: Fast motion compensation algorithm for video sequences with 
local brightness variations 

Abstract: In this paper, a fast motion compensation algorithm is proposed 
that improves coding efficiency for video sequences with brightness 
variations. We also propose a cross entropy measure between histograms 
of two frames to detect brightness variations. The framewise brightness 
variation parameters, a multiplier and an offset field for image intensity 

...ratio (PSNR) compared with the conventional method, with a greatly 
reduced computational load, when the video scene contains illumination 
changes. (Author abstract) 21 Refs. 

Identifiers: Motion estimation; Brightness variation compensation;
Brightness change detection; Cross entropy; Peak signal to noise ratio 
(PSNR) 
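
Note: the cross entropy measure between the histograms of two frames, which the abstract above proposes for detecting brightness variations, could be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: frames are flat lists of 8-bit intensities, the bin count, the smoothing constant eps and the threshold are hypothetical, and the decision rule (excess of the cross entropy over the first frame's own entropy) is one plausible choice among several.

```python
import math


def histogram(frame, bins=8, max_value=256):
    """Count intensities of a flat frame into equal-width bins."""
    counts = [0] * bins
    for value in frame:
        counts[value * bins // max_value] += 1
    return counts


def normalize(counts, eps=1e-6):
    """Turn counts into probabilities; eps (an assumption) avoids log(0)."""
    total = sum(counts) + eps * len(counts)
    return [(c + eps) / total for c in counts]


def cross_entropy(p, q):
    """H(p, q) = -sum_i p_i * log(q_i)."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))


def brightness_changed(prev_frame, cur_frame, threshold=0.1, bins=8):
    """Flag a brightness variation when H(p, q) exceeds H(p, p) by a threshold."""
    p = normalize(histogram(prev_frame, bins))
    q = normalize(histogram(cur_frame, bins))
    return cross_entropy(p, q) - cross_entropy(p, p) > threshold
```

Two identical frames give an excess of exactly zero, while a frame whose intensities are all shifted upward moves histogram mass into different bins and produces a large excess.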



12/3,K/4 (Item 1 from file: 94)

DIALOG (R) File 94 : JICST-EPlus 

(c)2006 Japan Science and Tech Corp(JST). All rts. reserv. 

02398609 JICST ACCESSION NUMBER: 95A0742266 FILE SEGMENT: JICST-E 
Hand Language Recognition Using Color Glove. 

YOSHINO KAZUYOSHI (1); KAWASHIMA TOSHIO (1); AOKI YOSHINAO (1) 
(1) Hokkaido Univ., Fac. of Eng.

Joho Shori Gakkai Kenkyu Hokoku, 1995, VOL. 95, NO. 68 (CV-95), PAGE. 51-58,
FIG. 10, REF. 11
JOURNAL NUMBER: Z0031BAO ISSN NO: 0919-6072 
UNIVERSAL DECIMAL CLASSIFICATION: 681.3:165 681.51:007.51 
LANGUAGE: Japanese COUNTRY OF PUBLICATION: Japan 

DOCUMENT TYPE: Journal 
ARTICLE TYPE: Original paper 
MEDIA TYPE: Printed Publication 



...ABSTRACT: estimated using the color combination of visible patches in
the input scene taken by a video camera. Patches are extracted by
computing the ratio of both the color histograms of the input
image and the model image which concatenates all the color patches on
the color glove. Additionally, the hand motion is estimated from
the change of the color combination and the trajectory of the
center of gravity of the finger patches. (author abst.)

12/3,K/5 (Item 1 from file: 144)

DIALOG (R) File 144 : Pascal 
(c) 2006 INIST/CNRS. All rts. reserv.

16189472 PASCAL No.: 03-0347872 

Fast local motion-compensation algorithm for video sequences with
brightness variations 

SANG HYUN KIM; PARK Rae-Hong 

Department of Electronic Engineering, Sogang University, Seoul 100-611,
Korea, Republic of 

Journal: IEEE transactions on circuits and systems for video technology, 
2003, 13 (4) 289-299 
Language: English

Copyright (c) 2003 INIST-CNRS. All rights reserved. 

Fast local motion-compensation algorithm for video sequences with
brightness variations 

This paper proposes a fast local motion-compensation algorithm that 
improves motion-compensation performance for video sequences with
brightness variations. The brightness variation parameters, a multiplier 
and an offset field for image intensity, are robustly estimated and local 
motions are compensated. We also propose the frame classification method 
based on the cross entropy between...

...signal-to-noise ratio than the conventional methods, with a low
computational load, when the video scene contains large brightness
changes.

English Descriptors: Motion compensation; Algorithm; Video signal
processing; Brightness; Histogram; Motion estimation; Change detection
Compensat

French Descriptors: Compensation mouvement; Algorithme; Traitement signal 
video; Brillance; Histogramme; Estimation mouvement; Detection
changement 

12/3,K/6 (Item 2 from file: 144) 

DIALOG (R) File 144 : Pascal 

(c) 2006 INIST/CNRS. All rts. reserv. 

14501942 PASCAL No.: 00-0165061

Performance characterization of video-shot-change detection methods

GARGI U; KASTURI R; STRAYER S H 

Department of Computer Science and Engineering, Pennsylvania State 
University, University Park, PA 16802, United States; Raytheon Systems 
Company, State College, PA 16802, United States 

Journal: IEEE transactions on circuits and systems for video technology, 
2000, 10 (1) 1-13 

Language: English



Copyright (c) 2000 INIST-CNRS. All rights reserved. 

Performance characterization of video-shot-change detection methods

A number of automated shot-change detection methods for indexing a video
sequence to facilitate browsing and retrieval have been proposed in recent
years. Many of these methods use color histograms or features computed
from block motion or compression parameters to compute frame differences.
It is important to evaluate...

...deliver a single set of algorithms that may be used by other researchers
for indexing video databases. We present the results of a performance
evaluation and characterization of a number of shot-change detection
methods that use color histograms, block motion matching, or MPEG
compressed data.

English Descriptors: Image processing; Video signal; Database;
Information retrieval; Automatic indexing; Color image; Segmentation;
Data compression; Information browsing; Block matching; Scene analysis;
Information extraction; Motion estimation; Performance evaluation;
Threshold detection; Edge detection; Algorithm performance; Histogram;
Experimental result; Waveform

French Descriptors: Traitement image; Signal video; Base donnee;
Recherche information; Indexation automatique; Image couleur;
Segmentation; Compression donnee; Navigation information; Correspondance 
bloc; Analyse scene; Extraction information; Estimation mouvement; 
Evaluation performance; Detection seuil; Detection contour; Performance 
algorithme; Histogramme; Resultat experimental; Forme onde 

Spanish Descriptors: Procesamiento imagen; Senal video; Base dato;
Recuperacion informacion; Indizacion automatica; Imagen color;
Segmentacion; Compresion dato; Navegacion informacion; Correspondencia
bloque; Analisis escena; Extraccion informacion; Estimacion movimiento;
Evaluacion prestacion; Deteccion umbral; Deteccion contorno; Resultado 
algoritmo; Histograma; Resultado experimental; Forma onda 
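
Note: the record above refers to shot-change detection methods that compute frame differences from color histograms. A bare-bones version of that family of methods could look like the sketch below. It is not one of the algorithms evaluated in the paper: for brevity it uses a single intensity channel instead of color, and the bin count, the normalization and the threshold are assumptions chosen for illustration.

```python
def intensity_histogram(frame, bins=4, max_value=256):
    """Count intensities of a flat frame into equal-width bins."""
    counts = [0] * bins
    for value in frame:
        counts[value * bins // max_value] += 1
    return counts


def histogram_difference(a, b):
    """Sum of absolute bin differences, scaled into the range [0, 1]."""
    total = sum(a) + sum(b)
    if total == 0:
        return 0.0
    return sum(abs(x - y) for x, y in zip(a, b)) / total


def shot_changes(frames, threshold=0.5, bins=4):
    """Indices i where frame i differs strongly from frame i - 1."""
    hists = [intensity_histogram(f, bins) for f in frames]
    return [
        i
        for i in range(1, len(hists))
        if histogram_difference(hists[i - 1], hists[i]) > threshold
    ]
```

A fixed global threshold like this one is the simplest decision rule; it is exactly the kind of parameter whose effect a performance characterization would need to measure.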



18/3,K/1 (Item 1 from file: 144)

DIALOG (R) File 144 : Pascal 

(c) 2006 INIST/CNRS. All rts. reserv.



13831712 PASCAL No.: 99-0007410 

Video segmentation using color difference histogram 
MINAR '98: multimedia information analysis and retrieval: Hong Kong,
13-14 August 1998 

LAM C F; LEE M C 

IP Horace HS, ed; SMEULDERS Arnold WM, ed 

Department of Computer Science and Engineering, The Chinese University of 
Hong Kong, Shatin, N.T., Hong Kong

IAPR international workshop (Hong Kong CHN) 1998-08-13 
Journal: Lecture notes in computer science, 1998, 1464 159-174 
Language: English 

Copyright (c) 1999 INIST-CNRS. All rights reserved. 

Video segmentation using color difference histogram 
This paper proposes a video segmentation algorithm based on a color
difference histogram (CDH) which is insensitive to illuminations, object
motions and camera movements. The relative high performance of the
algorithm relies to some extent on the newly devised video scene
detection method (DD) based on the analysis of the changes of video frame
differences. We have identified characteristic patterns for the changes
of frame difference values around the frame positions involving a scene
break, or a flashlight. The paper demonstrates experimentally that the
proposed algorithm outperforms other existing algorithms and that the DD
method can identify flashlights besides detecting scene breaks.

English Descriptors: Image processing; Segmentation; Algorithm; Motion
estimation; Image recognition; Experimental study; Algorithm performance



21/3,K/1 (Item 1 from file: 2) 

DIALOG (R) File 2 : INSPEC 

(c) 2006 Institution of Electrical Engineers. All rts. reserv. 

08726601 INSPEC Abstract Number: B2003-10-6135-216, C2003-10-5260D-050

Title: Robust automated footage analysis for professional media 
applications 

Author(s): Mateer, J.W.; Robinson, J. A. 
Author Affiliation: York Univ., UK

Conference Title: International Conference on Visual Information
Engineering (VIE 2003) (IEE Conf. Publ. No.495) p. 85-8
Publisher: IEE, London, UK 

Publication Date: 2003 Country of Publication: UK xvi+324 pp. 
ISBN: 0 85296 757 8 Material Identity Number: XX-2003-02845

Conference Title: International Conference on Visual Information
Engineering (VIE 2003). Ideas, Applications, Experience

Conference Date: 7-9 July 2003 Conference Location: Guildford, UK 
Language: English 
Subfile: B C 
Copyright 2003, IEE 

Abstract: We report a method for automated video indexing and shot 
characterization that meets the specific requirements of professional 
post-production and archivist end users. ASAP - Automated Shot Analysis
Program - interprets source video material in a manner consistent with 
industry practice and generates logs and searchable databases of... 

...test footage and rigorous metrics we show that ASAP is more robust than
well-established colour histogram, boundary detection methods and
effective at parsing complex camera movement. These results indicate that
our techniques are potentially valuable, for professional application.
...Descriptors: motion estimation; ...

...video databases...

...video signal processing

Identifiers: automated video indexing...

...automated video shot characterization...

...source video material interpretation...

...colour histogram boundary detection methods; complex camera
movement parsing

21/3,K/2 (Item 2 from file: 2)

DIALOG (R) File 2 : INSPEC 

(c) 2006 Institution of Electrical Engineers. All rts. reserv. 

08148608 INSPEC Abstract Number: B2002-02-6135E-113 , C2002-02-5260B-368 
Title: Flame recognition in video 

Author(s): Phillips, W., III; Shah, M.; da Vitoria Lobo, N.

Author Affiliation: Comput. Vision Lab., Univ. of Central Florida,
Orlando, FL, USA 

Journal: Pattern Recognition Letters vol.23, no. 1-3 p. 319-27 
Publisher: Elsevier, 

Publication Date: Jan. 2002 Country of Publication: Netherlands 
CODEN: PRLEDG ISSN: 0167-8655 



SICI: 0167-8655(200201)23:1/3L.319:FRV;1-C
Material Identity Number: D719-2001-013

U.S. Copyright Clearance Center Code: 0167-8655/02/$22.00
Language: English
Subfile: B C 
Copyright 2002, IEE 

Title: Flame recognition in video 
Abstract: This paper presents an automatic system for fire detection in
video sequences. We propose a system that uses color and motion
information computed from video sequences to locate fire. This is done by
first using an approach that is based upon creating a Gaussian-smoothed
color histogram to detect the fire-colored pixels, and then using a
temporal variation of pixels to determine which of these pixels are 
actually fire pixels. Next, some spurious fire pixels are automatically 
removed using an erode operation, and some missing fire pixels are found 
using region growing method. Unlike the two previous vision-based methods 
for fire detection, our method is applicable to more areas because of its 
insensitivity to camera motion. Two specific applications, which were not 
possible with previous algorithms, are the recognition of fire in the 
presence of global camera motion or scene motion and the recognition of 
fire in movies for possible use in an automatic rating system. We show that 
our method works in a variety of conditions, and that it can automatically 
determine when it has insufficient information. 

...Descriptors: motion estimation; object recognition
...Identifiers: video sequences...

...motion estimation; ...

...color histogram; region growing method; skin detection; computer
vision; change detection



21/3,K/3 (Item 3 from file: 2) 

DIALOG (R) File 2:INSPEC 

(c) 2006 Institution of Electrical Engineers. All rts. reserv. 

07868422 INSPEC Abstract Number: B2001-04-6135C-128, C2001-04-5260D-079

Title: Fast motion compensation algorithm for video sequences with local 
brightness variations 

Author(s): Sang Hyun Kim; Rae-Hong Park

Author Affiliation: Dept. of Electr. Eng., Sogang Univ., Seoul, South 
Korea 

Journal: Proceedings of the SPIE - The International Society for Optical 
Engineering Conference Title: Proc. SPIE - Int. Soc. Opt. Eng. (USA)
vol.4067, pt.1-3 p. 1229-38 

Publisher: SPIE- Int. Soc. Opt. Eng, 

Publication Date: 2000 Country of Publication: USA 
CODEN: PSISDG ISSN: 0277-786X 

SICI: 0277-786X(2000)4067:1/3L.1229:FMCA;1-3
Material Identity Number: C574-2000-219

U.S. Copyright Clearance Center Code: 0277-786X/2000/$15.00

Conference Title: Visual Communications and Image Processing 2000 

Conference Sponsor: SPIE; Univ. Western Australia; Inst. Eng. Australia;
Soc. Imaging Sci. & Technol.; IEEE

Conference Date: 20-23 June 2000 Conference Location: Perth, WA, 
Australia 

Language: English

Subfile: B C 

Copyright 2001, IEE 



Title: Fast motion compensation algorithm for video sequences with local 
brightness variations 

Abstract: In this paper, a fast motion compensation algorithm is proposed 
that improves coding efficiency for video sequences with brightness 
variations. We also propose a cross entropy measure between histograms 
of two frames to detect brightness variations. The framewise
brightness variation parameters, a multiplier and an offset field for image
intensity...

...noise ratio compared with the conventional method, with a greatly
reduced computational load, when the video scene contains illumination
changes.

...Descriptors: motion estimation; ...

...video coding

...Identifiers: video sequences...

...video scene



21/3,K/4 (Item 4 from file: 2)

DIALOG (R) File 2 : INSPEC 

(c) 2006 Institution of Electrical Engineers. All rts. reserv. 

07785016 INSPEC Abstract Number: B2001-01-6135E-096, C2001-01-1250M-058

Title: Real-time object tracking and human face detection in cluttered 
scenes 

Author(s): Dockstader, S.L.; Tekalp, A.M. 

Author Affiliation: Dept. of Electr. & Comput. Eng., Rochester Univ., NY,
USA 

Journal: Proceedings of the SPIE - The International Society for Optical 
Engineering Conference Title: Proc. SPIE - Int. Soc. Opt. Eng. (USA)
vol.3974 p. 957-68 

Publisher: SPIE- Int. Soc. Opt. Eng, 

Publication Date: 2000 Country of Publication: USA 

CODEN: PSISDG ISSN: 0277-786X 

SICI: 0277-786X(2000)3974L.957:RTOT;1-7

Material Identity Number: C574-2000-114 

U.S. Copyright Clearance Center Code: 0277-786X/2000/$15.00
Conference Title: Image and Video Communications and Processing 2000 
Conference Sponsor: SPIE; Soc. Imaging Sci. & Technol 

Conference Date: 25-28 Jan. 2000 Conference Location: San Jose, CA, 
USA 

Language: English 
Subfile: B C 
Copyright 2000, IEE 

Abstract: This paper presents a real-time video surveillance system 
which is capable of tracking multiple persons and locating faces in 
moderately complex...

...contain them. The algorithm describes a novel integration of dynamic
reference frame differencing and coarse motion estimation to overcome
the various occlusion problems encountered in multiple object tracking.
Change detection is performed...

...updated over time to account for changes in the background, illumination
variations, and the like. Video object segmentation establishes a mapping
from this binary change detection map to an indexed segmentation...



...We employ adaptive linear predictive filtering of the bounding box model
in conjunction with the motion displacement estimates to accurately
track multiple occluding objects. Once the video is segmented into
foreground and background areas, we search within a subset of the
foreground bounding boxes using chrominance histogram matching to
detect facial regions.
...Descriptors: video signal processing
...Identifiers: real-time video surveillance system...

...coarse motion estimation; ...

...video object segmentation; binary change detection map; coarse
directional information; adaptive linear predictive filtering; background
area



21/3,K/5 (Item 5 from file: 2)

DIALOG (R) File 2 : INSPEC 

(c) 2006 Institution of Electrical Engineers. All rts. reserv. 

07427626 INSPEC Abstract Number: B2000-01-6135-132, C2000-01-5260D-054
Title: Real-time video mosaics using luminance projection correlation

Author(s): Nagasaka, A.; Miyatake, T.

Author Affiliation: Central Res. Lab., Hitachi Ltd., Kokubunji, Japan
Journal: Transactions of the Institute of Electronics, Information and 
Communication Engineers D-II vol.J82D-II, no. 10 p. 1572-80 
Publisher: Inst. Electron. Inf. & Commun. Eng, 
Publication Date: Oct. 1999 Country of Publication: Japan 
CODEN: DTGDE7 ISSN: 0915-1923 

SICI: 0915-1923(199910)J82DII:10L.1572:RTVM;1-B
Material Identity Number: M973-1999-011
Language: Japanese
Subfile: B C
Copyright 1999, IEE

Title: Real-time video mosaics using luminance projection correlation 

Abstract: This paper introduces a real-time video mosaic method which 
can iterate four processes, that is, image capturing, image registration, 
pasting, previewing, in video rate. This method first calculates 
horizontal and vertical luminance projection histograms for each frame 
image in a video and then estimates camera motion, panning and
zooming, using correlation between the projection histograms of two 
consecutive frames. This method... 

...The temporal motion used to make the best matched histogram is decided
as an actual motion estimation result. Using this method, even a
conventional personal computer can estimate camera motion in real time
and can obtain panoramic or high resolution pictures just after taking a
shot.

...Descriptors: motion estimation; ...

...video signal processing

...Identifiers: real-time video mosaic...

...motion estimation; personal computer; real time; high resolution
pictures
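
Note: the approach summarized in the abstract above (horizontal and vertical luminance projections per frame, matched between consecutive frames to estimate camera motion) could be sketched for the panning case as follows. This is an illustration only, not the authors' method: zoom is ignored, the matching score is a mean absolute difference over the overlapping part of the profiles rather than whatever correlation measure the paper uses, and the names projections, best_shift and estimate_pan as well as the max_shift parameter are hypothetical.

```python
def projections(frame):
    """Column sums and row sums of a frame given as a list of luminance rows."""
    width = len(frame[0])
    column_profile = [sum(row[x] for row in frame) for x in range(width)]
    row_profile = [sum(row) for row in frame]
    return column_profile, row_profile


def best_shift(prev_profile, cur_profile, max_shift):
    """Shift s that best aligns prev_profile[i] with cur_profile[i + s]."""
    n = len(prev_profile)
    best, best_cost = 0, float("inf")
    for s in range(-max_shift, max_shift + 1):
        pairs = [
            (prev_profile[i], cur_profile[i + s])
            for i in range(n)
            if 0 <= i + s < n
        ]
        if not pairs:
            continue
        # Mean absolute difference over the overlap (an assumed matching score).
        cost = sum(abs(a - b) for a, b in pairs) / len(pairs)
        if cost < best_cost:
            best, best_cost = s, cost
    return best


def estimate_pan(prev_frame, cur_frame, max_shift=3):
    """Return (dx, dy): how far image content moved between the two frames."""
    prev_cols, prev_rows = projections(prev_frame)
    cur_cols, cur_rows = projections(cur_frame)
    return (
        best_shift(prev_cols, cur_cols, max_shift),
        best_shift(prev_rows, cur_rows, max_shift),
    )
```

Because each frame is reduced to two one-dimensional profiles, the search costs on the order of (width + height) * max_shift operations per frame pair instead of a full two-dimensional block search, which is consistent with the real-time claim in the abstract.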
21/3,K/6 (Item 1 from file: 8)



DIALOG (R) File 8:Ei Compendex(R) 

(c) 2006 Elsevier Eng. Info. Inc. All rts. reserv. 

07384122 E.I. No: EIP05189083698 

Title: Intelligent keyframe extraction for video printing 
Author: Zhang, Tong 

Corporate Source: Hewlett-Packard Laboratories, Palo Alto, CA 94304, 
United States 

Conference Title: Internet Multimedia Management Systems V 
Conference Location: Philadelphia, PA, United States Conference Date: 
20041026-20041028 

E.I. Conference No.: 64621 

Source: Proceedings of SPIE - The International Society for Optical 
Engineering Internet Multimedia Management Systems V v 5601 2004. 
Publication Year: 2004 
CODEN: PSISDG ISSN: 0277-786X 
Language: English

Title: Intelligent keyframe extraction for video printing 
Abstract: Nowadays most digital cameras have the functionality of taking 
short video clips, with the length of video ranging from several 
seconds to a couple of minutes. The purpose of this research is to develop 
an algorithm which extracts an optimal set of keyframes from each short 
video clip so that the user could obtain proper video frames to print 
out. In current video printing systems, keyframes are normally obtained 
by evenly sampling the video clip over time. Such an approach, however, 
may not reflect highlights or regions of interest in the video.
Keyframes derived in this way may also be improper for video printing in
terms of either content or image quality. In this paper, we present an...

...keyframe extraction approach to derive an improved keyframe set by
performing semantic analysis of the video content. For a video clip, a 
number of video and audio features are analyzed to first generate a 
candidate keyframe set. These features include accumulative color 
histogram and color layout differences, camera motion estimation,
moving object tracking, face detection and audio event detection. Then, 
the candidate keyframes are clustered. . . 

...different people and their actions in the scene; and to tell the story 
in the video shot. Moreover, frame extraction for video printing, 
which is a rather subjective problem, is considered in this work for the 
first time, and a semi-automatic approach is proposed. 8 Refs. 

Descriptors: *Video cameras; Information theory; Intelligent agents;
Motion estimation; Image analysis; Face recognition; Tracking (position);
Web browsers; Audio equipment; Automation; Learning systems

Identifiers: Keyframe extraction; Video printing; Video browsing; 
Camera motion estimation; Object motion tracking; Face detection;
Audio event detection; Semi-automatic keyframe extraction 
Title: Multiple frame motion inference using belief propagation

Author: Gao, Jiang; Shi, Jianbo 

Corporate Source: Robotics Institute Carnegie Mellon University, 
Pittsburgh, PA 15213, United States 

Conference Title: Proceedings - Sixth IEEE International Conference on
Automatic Face and Gesture Recognition FGR 2004

Conference Location: Seoul, South Korea Conference Date: 
20040517-20040519 

E.I. Conference No.: 63497 

Source: Proceedings - Sixth IEEE International Conference on Automatic 
Face and Gesture Recognition Proceedings - Sixth IEEE International 
Conference on Automatic Face and Gesture Recognition FGR 2004 2004. 

Publication Year: 2004 

ISBN: 0769521223 

Language: English 

...Abstract: algorithm is applied in a prototype system that can 
automatically label upper body motion from videos, without manual
initialization of body parts. 12 Refs. 

Descriptors: *Light propagation; Motion estimation; Inference engines;
Automation; Problem solving; Feature extraction; Constraint theory;
Video conferencing; Three dimensional computer graphics; Tracking
(position); Color image processing



21/3,K/8 (Item 3 from file: 8) 

DIALOG (R) File 8:Ei Compendex(R) 

(c) 2006 Elsevier Eng. Info. Inc. All rts. reserv. 

06756403 E.I. No: EIP04118060242 

Title: SnakeToonz: A Semi-Automatic Approach to Creating Cel Animation
from Video 

Author: Agarwala, Aseem 

Corporate Source: Dept. of Comp. Sci. and Engineering University of 
Washington, Seattle, WA, United States 

Conference Title: NPAR 2002 Symposium on Non-Photorealistic Animation and 
Rendering 

Conference Location: Annecy, France Conference Date: 20020603-20020605 
E.I. Conference No.: 62363 

Source: NPAR Symposium on Non-Photorealistic Animation and Rendering
2002.

Publication Year: 2002 
Language: English 

Title: SnakeToonz: A Semi-Automatic Approach to Creating Cel Animation
from Video 

...Abstract: that allows children and others untrained in cel animation
to create two-dimensional cartoons from video streams and images. The
ability to create cartoons has traditionally been limited to professional
animation houses and trained artists. SnakeToonz aims to give anyone with
a video camera and a computer the ability to create compelling cel
animation. This is done by...

...of that input. A cartoon is created in a dialogue with the system. 
After recording video material the user sketches contours directly onto 
the first frame of video. These sketches initialize a set of spline-based
active contours which are relaxed to best... 

...are closed, and the user can choose colors for the cartoon. The system 
then uses motion estimation techniques to track these contours through 
the image sequence. The user remains in the process to edit the cartoon as 
it progresses. 37 Refs. 

Descriptors: *Video signal processing; Animation; Interactive computer
graphics; Image processing; Video recording; Color motion pictures;
Computer vision; Image quality; Algorithms
Title: A color vector quantization based video coder
Author: Li, Zhu; Katsaggelos, Aggelos K. 

Corporate Source: Multimedia Communication Res. Lab. Motorola Labs, 
Schaumburg, IL, United States 

Conference Title: International Conference on Image Processing (ICIP'02) 

Conference Location: Rochester, NY, United States Conference Date: 
20020922-20020925 

E.I. Conference No.: 60384 

Source: IEEE International Conference on Image Processing v 3 2002. p 
III/673-III/676 (IEEE cat n 02ch37396) 
Publication Year: 2002 
CODEN: 85QTAW 
Language: English 

Title: A color vector quantization based video coder 
...Abstract: with limited color capability. In this paper we are 
proposing a color vector quantization-based video coder, exploiting the 
temporal stationary nature of color distribution among a group of pictures 
(GOP...

...applied first to reduce the RGB image sequence into a single channel 
color index image. Motion estimation and compression is then performed 
in the index space, instead of the separate YCbCr channels. Initial 
results demonstrated that the proposed coder can provide good compression 
rates. By eliminating the need for an inverse DCT and color conversion, 
typical requirements in a JPEG/MPEG type of coders, the decoding is 
computationally very simple. This makes it suitable for certain 
applications like media playback, and visual communications with low-end 
mobile devices. 10 Refs. 
Color computer graphics; Motion estimation; Decoding; Display
devices 
Conference Title: IEEE International Conference on Image Processing
Conference Location: Thessaloniki, Greece Conference Date:
20011007-20011010

E.I. Conference No.: 58801 

Source: IEEE International Conference on Image Processing v 2 2001. p 
69-72 (IEEE cat n 01CH37205) 
Publication Year: 2001 
CODEN: 85QTAW 
Language: English 



Title: Metrics for performance evaluation of video object segmentation 
and tracking without ground-truth

Abstract: We present metrics to evaluate the performance of video 
object segmentation and tracking methods quantitatively when ground-truth 
segmentation maps are not available. The proposed metrics are based on the 
color and motion differences along the boundary of the estimated video 
object plane and the color histogram differences between the current 
object plane and its temporal neighbors. These metrics can be used to 
localize (spatially and/or temporally) regions where segmentation results 
are good or bad; or combined to yield a single numerical measure to 
indicate the goodness of the boundary segmentation and tracking results. 
Experimental results are presented to evaluate the segmentation map of the 
"Man" object in the "Hall Monitor" sequence both in terms of a single 
numerical measure, as well as localization of the good and bad segments of 
the boundary. 8 Refs. 

Descriptors: *Image analysis; Image segmentation; Object recognition; 
Motion estimation; Color image processing; Feature extraction;
Algorithms 
Author: Ngo, C.-W.; Pong, T.-C; Zhang, H.-J.

Corporate Source: Department of Computer Science Hong Kong Univ. of Sci. 
and Technol., Clear Water Bay, Kowloon, Hong Kong

Conference Title: -ACM Multimedia 2001 Workshops- 2001 Multimedia 
Conference 

Conference Location: Ottawa, Ont., Canada Conference Date:
20010930-20011005 

E.I. Conference No.: 58703 

Source: Proceedings of the ACM International Multimedia Conference and 
Exhibition n IV 2001. p 51-60 
Publication Year: 2001 
Language: English 

Title: On clustering and retrieval of video shots 

Abstract: Clustering of video data is an important issue in video 
abstraction, browsing and retrieval. In this paper, we propose a two-level 
hierarchical clustering approach by aggregating shots with similar motion 
and color features. Motion features are computed directly from 2D tensor
histograms, while color features are represented by 3D color
histograms. Cluster validity analysis is further applied to automatically
determine the number of clusters at each level. Video retrieval can then 
be done directly based on the result of clustering. The proposed approach 

. . .games, where motion and color are important visual cues when searching 
and browsing the desired video shots. Since most games involve two 
teams, classification and retrieval of teams becomes an interesting topic. 
To achieve these goals, nevertheless, an initial as well as critical step 
is to isolate team players from background regions. Thus, we also 
introduce an approach to segment foreground objects (players) prior to
classification and retrieval. 8 Refs. 

Descriptors: *Video recording; Feature extraction; Motion estimation;
Algorithms; Tensors; Color; Image analysis; Video signal processing;
Image segmentation; Two dimensional; Three dimensional

Identifiers: Video shots; Hierarchical clustering; Motion and color

Document type: journal article Language: English
Record type: Abstract
ABSTRACT:

. . .The articulation procedure is based on the homogeneity of parameters, 
such as rigid 3-D motion, color, and depth, estimated for each
subobject, which consists of a number of interconnected triangles of the 
3-D model. The rigid 3-D motion of each subobject for subsequent frames is 
estimated using a Kalman filtering algorithm, taking into account the 
temporal correlation between consecutive frames. Information from all 
cameras is combined during the formation of the equations for the rigid 3-D 
motion parameters. The threshold used in the object segmentation procedure 
is updated at each iteration using the histogram of the subobject 
parameters. The parameter estimation for each subobject and the 3-D model 
segmentation procedures are interleaved and repeated iteratively until a 
satisfactory object segmentation emerges. The performance of the resulting 
segmentation method is evaluated experimentally. 

DESCRIPTORS: CORRELATION METHOD; FILTER THEORY; IMAGE SEGMENTATION; IMAGE
SEQUENCES; KALMAN FILTERS; PARAMETER ESTIMATION; COMPUTER CONFERENCING;
COLOR; TEMPORAL CORRELATION; ITERATIVE METHOD; HISTOGRAMS; MOTION
ESTIMATION; STEREO IMAGE PROCESSING; VIDEO SIGNAL PROCESSING

Intelligent keyframe extraction for video printing

Nowadays most digital cameras have the functionality of taking short 
video clips, with the length of video ranging from several seconds to a 
couple of minutes. The purpose of this research is to develop an algorithm 
which extracts an optimal set of keyframes from each short video clip so 
that the user could obtain proper video frames to print out. In current 
video printing systems, keyframes are normally obtained by evenly 
sampling the video clip over time. Such an approach, however, may not 
reflect highlights or regions of interest in the video. Keyframes derived
in this way may also be improper for video printing in terms of either 
content or image quality. In this paper, we present an. . . 

. . . keyframe extraction approach to derive an improved keyframe set by 
performing semantic analysis of the video content. For a video clip, a 
number of video and audio features are analyzed to first generate a 
candidate keyframe set. These features include accumulative color 
histogram and color layout differences, camera motion estimation,
moving object tracking, face detection and audio event detection. Then, the 
candidate keyframes are clustered. . . 

. . . different people and their actions in the scene; and to tell the story 
in the video shot. Moreover, frame extraction for video printing, which 
is a rather subjective problem, is considered in this work for the first 
time, and a semi-automatic approach is proposed. 

English Descriptors: Video signal; Image content; Image quality; Semantic 
analysis; Image analysis; Image processing; Content analysis; Multimedia; 
Histogram; Mobility; Motion estimation; Facies; Printing;
Displacement measurement; Moving body; Proximity detector; Optimal 
algorithm; Sampling; Interest region 

French Descriptors: Signal video; Contenu image; Qualite image; Analyse
semantique; Analyse image; Traitement image; Analyse contenu; Multimedia; 
Histogramme; Mobilite; Estimation mouvement; Facies; Impression; Mesure 
deplacement; Corps mobile; Detecteur proximite; Algorithme optimal; 
Echantillonnage; Region interet 

Spanish Descriptors: Senal video; Contenido imagen; Calidad imagen;
Analisis semantico; Analisis imagen; Procesamiento imagen; Analisis
contenido; Multimedia; Histograma; Movilidad; Estimacion movimiento; 
Facies; Impresion; Medicion desplazamiento; Cuerpo movil; Detector 
proximidad; Algoritmo optimo; Muestreo; Region interes 

Corporate Source: NASA, Lyndon B. Johnson Space Cent, Houston, TX, USA 
Conference Title: AIAA 20th Thermophysics Conference. 

Conference Location: Williamsburg, VA, USA Conference Date: 19850619 
E.I. Conference No.: 06745 

Source: AIAA Paper Publ by AIAA, New York, NY, USA AIAA-85-1027, 6p

Publication Year: 1985 

CODEN: AAPRAQ ISSN: 0146-3705 

Language: English 

Title: COLOR GRAPHICS PRESENTATION OF THE SPACE SHUTTLE ORBITER
WINDWARD SURFACE ENTRY TEMPERATURE DISTRIBUTION.
Author: Martin, F. W. Jr. ; Schuster, D. J. 

...Abstract: measured during the fifth entry of the Space Shuttle
Columbia is presented as solid filled color contour plots. These plots
show the data from 92 instruments, at selected points in time, in a manner 
which makes the temperature extremes and gradients immediately obvious. 
Several physical phenomena, such as separated flow caused by the deflected 
body flap, local heating at the elevon-elevon gap, an overview of the 
propagation of boundary-layer transition over the Orbiter windward surface,
and the thermal response of eight catalytically coated tiles, can be 
observed or inferred from the displayed temperatures. In addition, the 
maximums from each instrument have been contoured and are presented with a 
companion plot showing the percentage of surface area covered by each 
contour level. Also, the flight data are presented using the colors which 
correspond to the surface emittance as a function of temperature. To show 
the temperature transients, a computer-generated movie has been produced 
showing the temperature contours from 100 s to 1500 s after entry 
interface. 9 refs. 

Identifiers: COLOR GRAPHICS PRESENTATION; ORBITER WINDWARD SURFACE; 
DATA FROM 92 INSTRUMENTS; LAMINAR HEATING SEQUENCE 

Micronas Presents a Quantum Leap in the TV Viewing Experience.

PR Newswire, pNA 
Jan 4, 2005 

Language: English Record Type: Fulltext 
Document Type: Newswire; Trade 
Word Count: 779 

source material and the 60 frames per second of flat panels TVs. 
Accurate vector-based motion estimation makes these fill-in frames as 
sharp as the originals. truD further improves the image quality by 
enhancing image contrast and sharpness with advanced video algorithms. 
These include peaking, sub-pixel luminance sharpness enhancement (LSE),
chrominance sharpness enhancement (CSE), and dynamic histogram-based
contrast adjustment. 

The FRC 9429A integrates all the functions of a high-end frame rate
converter for DTV, including video memory, in one monolithic IC. It is 
ideally suited to work together with video systems solutions for CRT, LCD, 
Plasma, and Digital Light Projection (DLP) displays, such as the Micronas 
deflection processor DDP3315/16 or the DTV scaler DPS9455B. 

The FRC 9429A comes in a QFP-144 package. Fully qualified samples 
and reference designs are available now and volume production has started 
with major OEMs. Prices for high quantities range from approximately $20 to 
$26 (US), depending on the product version and volume. 

About Micronas 

Micronas, a semiconductor designer and manufacturer with worldwide 
operations, is a leading supplier of cutting-edge IC and sensor system
solutions for consumer and automotive electronics. As a market leader in 
innovative, global TV system solutions, Micronas leverages its expertise 
into new markets emerging through the digitization of audio and video 
content. Micronas serves all major consumer brands worldwide, many of them 
in continuous partnerships seeking joint success. While the holding is 
headquartered in Zurich (Switzerland), operational headquarters are based
in Freiburg (Germany). Currently, the Micronas Group employs about 1900
people. In 2003, it generated CHF 767 million in sales. For more
information on Micronas and its products, please visit
http://www.micronas.com/.

CONTACT: Micronas Press Office, +49-761-517-2324, or fax,
+49-761-517-2622, or media@micronas.com; or Anja Maria Hastenrath, +49 171
1959330, for Micronas GmbH

Web site: http://www.micronas.com/

Video compression/decompression chips aim at wide range of applications.
(Integrated Information Technologies' Vision Control Processor and Audio
Digital Imaging's Apogee M-1 processor family) (Software & Development
Tools) (Product Announcement)
Williams, Tom 

Computer Design, v32, n12, p40(2)
Dec, 1993 

DOCUMENT TYPE: Product Announcement ISSN: 0010-4566 LANGUAGE: ENGLISH
RECORD TYPE: FULLTEXT; ABSTRACT

WORD COUNT: 842 LINE COUNT: 00066 

RISC processor, which among other things, handles error correction, 
multiplexing of the compressed audio and video, and parsing of the bit
stream protocol. Standard production versions of the VCP have microcode
burned-in, but custom designs can also be implemented. An on-chip video
processor unit handles the MPEG, JPEG and H.261 compression/decompression 
operations using algorithms stored. . . 

. . . implementation of the compression algorithm, which McNiffe says are best
achieved with a programmable platform. Motion estimation and error
correction are likewise carried out under program control. For honing
picture quality, the VCP offers pre- and post-processing with programmable
filtering, scaling, color-conversion and graphics-overlay functions.
With the ADI family, you can choose to use the M-1 codec...

...which, in turn, yields lower compressed data rates. Filter coefficients 
are software-selectable. The ME motion-estimation chip performs
realtime vector searches on 4 x 4 pixel blocks with half-pixel accuracy. 

Both Audio Digital Imaging and Integrated Information Technologies 
appear to be targeting the same arena of applications: MPEG 
compression/decompression, video conferencing, multimedia applications and 
decode-only applications such as CD-ROM and cable broadcast of digitally 
compressed programs. Both companies are convinced that cost is the primary 
consideration, yet their approaches differ: modularity and hard-coding on 
the part of ADI, and high integration and programmability on the part of 
IIT. 

Quantity 1,000 prices are between $140 and $400 for the VCP (depending 
on speed grade) and about $500 for the M-1 (about $1,000 for all three
chips). IIT's aggressive pricing appears to have the edge. In addition, IIT 
is reportedly preparing a stripped-down version of the VCP for decode-only 
applications, which could make the price competition even hotter. 

...SPECIFICATION from the colors in the current frame 250, where C is the
total number of colors. Then, the ratio histogram $RH_j$, $j = 1, \dots, C$,
799, is defined as [equation omitted]. This histogram, $RH_j$,
$j = 1, \dots, C$, is back-projected onto the current frame 250, that is,
the image values are replaced by the values of $RH_j$, $j = 1, \dots, C$,
that they index. The back-projected image 299 is then convolved by a mask
901, which for compact objects of unknown orientation could be a circle
with the same average area as the expected area subtended by the object in
the reference frames. Referring to Fig. 7, the peaks in the convolved
image are then picked as the expected location of the object in the current
frame. The masks $M_k$, $k = 1, \dots, K$, 951-952 are constructed, where K
denotes the total number of peaks selected in the current frame, by
shifting the center of the mask used for convolving the back-projected
image to the image locations $(x_k, y_k)$, $k = 1, \dots, K$, 498 of the
peaks. These masks $M_k$, $k = 1, \dots, K$, are then intersected with the
estimated rectangle for the marker in Step 130, whose image plane location
is defined by its center, width and height $(\hat{r}_x, \hat{r}_y)$,
$\hat{w}$ and $\hat{h}$ 351. The masks that have no intersection are
eliminated from the list. Further eliminated, among the remaining masks
$\hat{M}_k$, $k = 1, \dots, \hat{k}$, 951, are the ones that do not contain
the predicted location $(\hat{x}, \hat{y})$ 451 of the object. After all
redundant masks are eliminated, the rectangle RR 352, whose center, width
and height are denoted by $(R_x, R_y)$, W and H, respectively, is
constructed. The pixels with minimum and maximum horizontal coordinates in
the current frame in the remaining masks $\hat{M}_k$,
$k = 1, \dots, \hat{k}$, 951 are found. Let $y_{min}$ and $y_{max}$ denote
the horizontal coordinates of these pixels. Similarly, the pixels with
minimum and maximum vertical coordinates are found, and the vertical
coordinates of these are denoted by $x_{min}$ and $x_{max}$, respectively.
Then, letting [equation omitted], the rectangle RR is set as the search
space for the template matching that is disclosed in Step 160. If the
current frame is in between the start and end frames then the confidence
level of tracking for the current frame is set to the average of those for
the start and end frames. Otherwise, the confidence level of tracking is
set to the value for the end frame. Next, Step 160 is considered:

Identifying the marker of the object in the current frame based on
template matching. Referring to Fig. 8, within the search space defined
by the rectangle RR 352 in the current frame 250, a search is
performed for the best location 450 and marker 350 of the object either
using methods for template matching by correlation or robust mean ...321,
1993. H. S. Sawhney, S. Ayer, and M. Gorkani, "Model-based 2D&3D dominant
motion estimation for mosaicing and video representation," Int.
Conf. Computer Vision, 1995.

The correlation metric is defined as follows: [equation omitted], where
$M(i, j)$ denotes the reference template, $I(T(i, j))$ denotes the intensity
distribution of the current frame 250 under the spatial transformation T,
and N denotes the total number of pixels in the template. The spatial
transformation T is defined as [equation omitted], where z denotes the zoom
factor, and $d_x$ and $d_y$ denote the vertical and horizontal
displacements.

The zoom factor z in this formula accounts for the camera zoom in and 
out and the object's motion with respect to the camera. When the camera 
zooms in and out to the object or the scene, factor z should increase and 
decrease, respectively. When the zoom factor z equals 1.0, then the 
motion of the object or the camera is only translational and the size of 
the object does not change. When the zoom factor z is less than or larger 
than 1.0, then the object gets smaller or bigger, respectively. 
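
The correlation and transformation formulas are not reproduced in this excerpt, so the following is only a minimal sketch under stated assumptions. It assumes T(i, j) = (z*i + dx, z*j + dy) with nearest-neighbour sampling, and it uses a zero-mean normalized cross-correlation as a stand-in for the metric C. The function names are hypothetical.

```python
import math

def transform(i, j, z, dx, dy):
    """Assumed form of T: scale by the zoom factor z, then shift.
    Following the text, dx is the vertical and dy the horizontal displacement."""
    return int(round(z * i + dx)), int(round(z * j + dy))

def correlation(template, frame, z, dx, dy):
    """Zero-mean normalized cross-correlation between the template M(i, j)
    and the frame sampled at T(i, j). Returns None if T leaves the frame."""
    m_vals, f_vals = [], []
    for i, row in enumerate(template):
        for j, m in enumerate(row):
            r, c = transform(i, j, z, dx, dy)
            if not (0 <= r < len(frame) and 0 <= c < len(frame[0])):
                return None
            m_vals.append(m)
            f_vals.append(frame[r][c])
    n = len(m_vals)
    m_mean = sum(m_vals) / n
    f_mean = sum(f_vals) / n
    num = sum((m - m_mean) * (f - f_mean) for m, f in zip(m_vals, f_vals))
    den = math.sqrt(sum((m - m_mean) ** 2 for m in m_vals)
                    * sum((f - f_mean) ** 2 for f in f_vals))
    # A flat template or flat frame patch has no defined correlation; use 0.
    return num / den if den else 0.0
```

With z = 1.0 the sketch reduces to a pure translation of the template, which matches the translational case described above.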

The robust mean square error is formulated as [equation omitted], where
$\alpha$ is a constant value, which is preferably set to 10.

In order to find the best marker location, that is, the location that
gives the least mean square error MSE or the maximum correlation value C
in the current frame 250, logarithmic or exhaustive search strategies are
applied in the 3-D $(z, d_x, d_y)$ space.
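
The exhaustive variant of this search can be sketched as below. This is an illustration under assumptions: it uses a plain mean square error (the robust form with the constant alpha is not reproduced in this excerpt), the same assumed transform (z*i + dx, z*j + dy) with nearest-neighbour sampling, and hypothetical function names and candidate ranges.

```python
def mse(template, frame, z, dx, dy):
    """Plain mean square error between the template and the frame sampled at
    the assumed transform; None if any sample falls outside the frame."""
    total, n = 0.0, 0
    for i, row in enumerate(template):
        for j, m in enumerate(row):
            r, c = int(round(z * i + dx)), int(round(z * j + dy))
            if not (0 <= r < len(frame) and 0 <= c < len(frame[0])):
                return None
            total += (m - frame[r][c]) ** 2
            n += 1
    return total / n

def exhaustive_search(template, frame, zooms, dx_range, dy_range):
    """Try every (z, dx, dy) candidate and keep the one with the lowest MSE.
    Returns (error, z, dx, dy), or None if no candidate fits in the frame."""
    best = None
    for z in zooms:
        for dx in dx_range:
            for dy in dy_range:
                err = mse(template, frame, z, dx, dy)
                if err is not None and (best is None or err < best[0]):
                    best = (err, z, dx, dy)
    return best
```

A logarithmic search would visit far fewer candidates by shrinking the step size around the current best point; the exhaustive form is shown only because it is the simplest to verify.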

In order to enhance the tracking performance, the object template is
also divided into sub-templates, and the search is carried out for each
sub-template separately. Then, the motion models of those sub-templates
that result in a lower mean square error MSE or a higher correlation
value C than the others are selected to fit a global motion model for
the object.

Next, Step 170 is considered: Updating the confidence level of the
tracking in the current frame. Depending on the value of the maximum
correlation, or the minimum matching value, whichever is used during
template matching, the confidence level of tracking is updated. If the
correlation metric C is used for template matching, then the best
correlation value is multiplied by 100 to obtain the confidence level of
the tracking in the current frame 250. Otherwise, the confidence level of
the tracking in the current frame is set to $100 \times (1.0 - MSE_{best})$,
where $MSE_{best}$ is the minimum mean square error for the template
matching.
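
The two confidence updates described in Step 170 amount to the small sketch below. The function names are hypothetical, and the sketch assumes the mean square error has been normalized to the range [0, 1], which the excerpt does not state explicitly.

```python
def confidence_from_correlation(best_correlation):
    """Confidence level when the correlation metric C is used: 100 * C."""
    return 100.0 * best_correlation

def confidence_from_mse(best_mse):
    """Confidence level when the mean square error is used:
    100 * (1.0 - MSE_best). Assumes MSE_best lies in [0, 1]."""
    return 100.0 * (1.0 - best_mse)
```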

Next, Step 180 is considered: Finding the location and shape of the 
object in the current frame and updating the template of the object in 
the current frame. Inside the computed marker of the object in the 
current frame 250, the backprojected image is convolved with a mask. A 
mask is selected in the shape of the object in reference frame whose 
template is used for matching. All the image values are ordered within 
the marker of the object in the convolved back-projected image. The
ratios $\gamma_1$ and $\gamma_2$ of the region covered by the object are
computed within its corresponding shape in the start and end frames,
respectively. A ratio $\hat{\gamma}$ for the current frame 250 is computed
using the following formula: [equation omitted]

Out of the pixels that correspond to the $\hat{\gamma}$ percent of the
ordered image in the convolved back-projected image, a mask is
constructed. A morphological opening and closing is applied to obtain
the final mask for the object in the current frame 250. The boundary of 
this mask is then set to the boundary of the object 550 in the current 
frame 250. 
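
The mask construction in Step 180 can be illustrated roughly as follows. The sketch assumes the convolved back-projected image inside the marker is given as a list of rows, takes the ratio as a fraction in [0, 1] rather than a percentage, and uses a 3x3 square structuring element; none of these choices is stated in the excerpt, and the function names are hypothetical.

```python
def top_fraction_mask(image, fraction):
    """Mark the pixels holding the largest `fraction` of the ordered values.
    Ties at the threshold value are all kept."""
    values = sorted((v for row in image for v in row), reverse=True)
    keep = max(1, int(round(fraction * len(values))))
    threshold = values[keep - 1]
    return [[1 if v >= threshold else 0 for v in row] for row in image]

def _morph(mask, op):
    """Apply `op` (min for erosion, max for dilation) over each 3x3
    neighbourhood; neighbours outside the image are ignored."""
    h, w = len(mask), len(mask[0])
    return [[op(mask[r][c]
                for r in range(max(0, i - 1), min(h, i + 2))
                for c in range(max(0, j - 1), min(w, j + 2)))
             for j in range(w)] for i in range(h)]

def opening(mask):
    """Erosion followed by dilation: removes small isolated specks."""
    return _morph(_morph(mask, min), max)

def closing(mask):
    """Dilation followed by erosion: fills small holes and gaps."""
    return _morph(_morph(mask, max), min)

def object_mask(convolved, fraction):
    """Threshold the ordered values, then clean the result morphologically."""
    return closing(opening(top_fraction_mask(convolved, fraction)))
```

The boundary of the returned mask would then play the role of the object boundary 550 described above.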

Next, Step 190 is considered: Piecewise linear interpolation. Once the 
marker and locations of the object in a subset of frames in between the 
first and last frames are computed, piecewise linear interpolation is 
applied to obtain the marker and location of the object for every frame. 
The processed frames are ordered according to their time indices. The 
marker and location of the object in every frame in between any 
consecutive frames in the ordered processed frames are then estimated. 
Letting $t_1$ and $t_2$ denote the time indices for such consecutive
frames, and t denote the time index for a frame in between them, the same
linear interpolation formulas described in Step 130 are utilized to find
the location and marker of the object at time t, from the location and
marker of the object at instants $t_1$ and $t_2$.
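
The Step 130 formulas are not part of this excerpt, so the sketch below simply assumes standard linear interpolation of a marker given as (center x, center y, width, height). The names and the tuple layout are hypothetical.

```python
def interpolate_marker(t, t1, marker1, t2, marker2):
    """Linearly interpolate a marker for a frame at time t, t1 <= t <= t2."""
    if t2 == t1:
        return tuple(marker1)
    w = (t - t1) / (t2 - t1)
    return tuple((1.0 - w) * a + w * b for a, b in zip(marker1, marker2))

def interpolate_all(processed):
    """`processed` maps time index -> marker for the frames that were tracked.
    Returns a marker for every integer time index from the first processed
    frame to the last, interpolating piecewise between consecutive ones."""
    times = sorted(processed)
    result = {}
    for t1, t2 in zip(times, times[1:]):
        for t in range(t1, t2):
            result[t] = interpolate_marker(t, t1, processed[t1],
                                           t2, processed[t2])
    result[times[-1]] = tuple(processed[times[-1]])
    return result
```

For example, with markers known at time indices 0, 4 and 6, the frames at indices 1 to 3 are filled from the first pair and the frame at index 5 from the second pair.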

The tool in accordance with the present invention provides a marker 
location for a selected object in every frame in between two selected 
frames of a video. Rather than tracking the object in every frame, the 
method in accordance with the invention tracks the object and the marker 
for a subset of frames in between the first and last frames. Then, using 
the marker locations in these frames, an interpolation is carried out of 
the marker locations in the other frames that are not processed. 

In summary, the present invention provides a method for tracking a 
video object in an image sequence in between two time instances given the 
locations and shape of the object in those time instances. The method in 
accordance with the invention provides a real-time reliable approach to 
tracking objects in complex environments. The trajectory information 
extracted from the marker and location information in at least two or 
more frames that are already processed is used to predict the location 
and marker of the object in any frame between the selected time 
instances. In addition, the color information of the object in these 
selected time instances is used to obtain a second prediction for the 
marker and location of the object in 

compressed domains.

For example, in the article by S.W. Smoliar et al., "Content-Based Video
Indexing and Retrieval," IEEE Multimedia, summer 1994, pp. 62-72, a
color histogram comparison technique is proposed to detect scene cuts
in the spatial (uncompressed) domain. In the article by B. Shahraray,
"Scene Change Detection and Content-Based Sampling of Video Sequences,"
SPIE Conf. Digital Image Compression: Algorithms and Technologies 1995,
Vol. 2419, a block-based match and motion estimation algorithm is
presented.

For compressed video information, the article by F. Arman et al., "Image
Processing on Compressed Data for Large Video Databases," Proceedings of
ACM Multimedia '93, June 1993, pp. 267-272, proposes a technique for
detecting scene cuts in JPEG compressed images by comparing the DCT
coefficients of selected blocks from each frame. Likewise, the article by
J. Meng et al., "Scene Change Detection in a MPEG Compressed Video
Sequence," IS&T/SPIE Symposium Proceedings, Vol. 2419, Feb. 1995, San
Jose, California, provides a methodology for the detection of direct scene
cuts based on the distribution of motion vectors, and a technique for the
location of transitional scene cuts based on DCT DC coefficients.
Algorithms disclosed in the article by M.M. Yeung et al., "Video Browsing
using Clustering and Scene Transitions on Compressed Sequences," IS&T/SPIE
Symposium Proceedings, Feb. 1995, San Jose, California, Vol. 2417, pp.
399-413, enable the browsing of video shots after scene cuts are located.

However, the Smoliar et al., Shahraray, and Arman et al. references are
limited to scene change detection, and the Meng et al. and Yeung et al.
references do not provide any functions for editing compressed video.

Others in the field have attempted to address problems associated with
camera operation and moving objects in a video sequence. For example, in
the spatial domain, H.S. Sawhney et al., "Model-Based 2D & 3D Dominant
Motion Estimation for Mosaicking and Video Representation," Proc. Fifth
Int'l Conf. Computer Vision, Los Alamitos, CA, 1995, pp. 583-590, proposes
to find parameters of an affine matrix and to construct a mosaic image from
a sequence of video images. In a similar vein, the work by A. Nagasaka et
al., "Automatic Video Indexing and Full-Video Search for Object
Appearances," in E. Knuth and L. M. Wegner, editors, Video Database
Systems, II, Elsevier Science Publishers B.V., North-Holland, 1992, pp.
113-127, proposes searching for object appearances and using them in a
video indexing technique.

In the compressed domain, the detection of certain camera operations,
e.g., zoom and pan, based on motion vectors has been proposed in both A.
Akutsu et al., "Video Indexing Using Motion Vectors," SPIE Visual
Communications and Image Processing 1992, Vol. 1818, pp. 1522-1530, and
Y.T. Tse et al., "Global Zoom/Pan Estimation and Compensation For Video
Compression," Proceedings of ICASSP 1991, pp 2728. In these proposed
techniques, simple three parameter models are employed which require two
assumptions, i.e., that camera panning is slow and focal length is long.
However, such restrictions make the algorithms unsuitable for general
video processing.

There have also been attempts to develop techniques aimed specifically at 
digital video indexing. For example, in the aforementioned Smoliar et al.
article, the authors propose using finite state models 

...SPECIFICATION used technique uses a color histogram. Color histograms
have been widely used in image and video indexing and retrieval, see
Smith et al. in "Automated Image Retrieval Using Color and Texture...

...four bins for each channel, a total of 64 (4x4x4) bins are needed for
the color histogram.
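
The 4x4x4 binning can be illustrated with a short sketch. It assumes 8-bit RGB pixels supplied as (r, g, b) tuples; the function name and the bin ordering are hypothetical choices, not taken from the specification.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Quantize each 8-bit channel into `bins_per_channel` bins and count
    pixels per joint (R, G, B) bin. Four bins per channel gives 4*4*4 = 64."""
    n = bins_per_channel
    hist = [0] * (n ** 3)
    for r, g, b in pixels:
        # Map 0..255 onto 0..n-1 for each channel.
        rq, gq, bq = r * n // 256, g * n // 256, b * n // 256
        hist[(rq * n + gq) * n + bq] += 1
    return hist
```

Dividing each count by the number of pixels would give a normalized histogram that can be compared across frames of different sizes.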

Motion features 

Motion information is mostly embedded in motion vectors. Motion vectors 
can be extracted from P- and B-frames. Because motion vectors are usually
...to use motion vectors are known, see Tan et al. "A new method for
camera motion parameter estimation," Proc. IEEE International
Conference on Image Processing, Vol. 2, pp. 722-726, 1995, Tan et al.
"Rapid estimation of camera motion from compressed video with
application to video annotation," to appear in IEEE Trans. on Circuits
and Systems for Video Technology, 1999. Kobla et al. "Detection of
slow-motion replay sequences for identifying sports videos," Proc. IEEE
Workshop on Multimedia Signal Processing, 1999, Kobla et al. "Special
effect edit detection using VideoTrails: a comparison with existing


Step 410 determines the relative intensity of motion activity for each 
frame of each shot. Each frame is classified into either a first or 
second class. The first class includes frames that are relatively easy to 
summarize, and the second class 412 includes frames that are relatively 
difficult to summarize. In other words, our classification is motion 
based. 

Consecutive frames of each shot that have the same classification are
grouped into either an "easy" to summarize segment 411 or a "difficult"
to summarize segment 412.
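
The classification and grouping in Step 410 can be sketched as follows. The excerpt does not say how motion activity is measured or thresholded, so the sketch assumes one activity value per frame, a single hypothetical threshold, and that low activity corresponds to the "easy" class.

```python
from itertools import groupby

def segment_by_motion(activity, threshold):
    """Label each frame "easy" (activity at or below the threshold) or
    "difficult" (above it), then merge consecutive frames with the same
    label. Returns a list of (label, first_index, last_index) tuples."""
    labels = ["difficult" if a > threshold else "easy" for a in activity]
    segments, start = [], 0
    for label, run in groupby(labels):
        length = len(list(run))
        segments.append((label, start, start + length - 1))
        start += length
    return segments
```

Each "easy" tuple would then be summarized by a single key frame, while each "difficult" tuple would be passed to the color based process.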

For easy segments 411 of each shot, we perform a simple summarization
420 of the segment by selecting a key frame or a key sequence of frames
421 from the segment. The selected key frame or frames 421 can be any
frame in the segment because all frames in an easy segment are considered
to be semantically alike.

For difficult segments 412 of each shot, we apply a color based 
summarization process 500 to summarize the segment as a key sequence of 
frames 431. 

The key frames 421 and 431 of each shot are combined to form the
summary of each shot, and the shot summaries can be combined to form the
final summary S(A) 402 of the video.

The combination of the frames can use temporal, spatial, or semantic 
ordering. In a temporal arrangement, the frames are concatenated in some 
temporal order, for example first-to-last, or last-to-first. In a spatial 
arrangement, miniatures of the frames are combined into a mosaic or some 
array, for example, rectangular so that a single frame shows several 
miniatures of the selected frames 

Detailed Description

confidence criteria could include number of pixels in the motion mask
(too many indicates the motion estimate is off), the degree of color
histogram separation, the actual matching score of the template, and
various others known to those familiar. . .



..of the target. A best shot is the optimal, or highest quality, frame in 
a video sequence of a target for recognition purposes, by human or 
machine. The best shot may be different for different targets, including 
human faces and vehicles. The idea is not necessarily to recognize the 
target, but to at least calculate those features that would make 
recognition easier. Any technique to predict those features can be used. 

In this embodiment, the master 11 chooses a best shot. In the case of a
human target, the master will choose based on the target's percentage of
skin-tone pixels in the head area, the target's trajectory (walking
towards the camera is good), and size of the overall blob. In the case of
a vehicular target, the master will choose a best shot based on the size
of the overall blob and the target's trajectory. In this case, for
example, heading away from the camera may give superior recognition of 
make and model information as well as license plate information. A 
weighted average of the various criteria will ultimately determine a 
single number used to estimate the quality of the image. The result of 
the best shot is that the master's inference engine 23 orders any slave 
12 tracking the target to snap a picture or obtain a short video clip. At 
the time a target becomes interesting (loiters, steals something, crosses 
a tripwire etc.), the master will make such a request. Also, at the time 
an interesting target exits the field of view, the master will make 
another such request. The master's 11 response engine 24 would collect
all resulting pictures and deliver the pictures or short video clips for 
later review by a human watchstander or human identification algorithm. 
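
The weighted-average quality estimate can be illustrated as below. The criterion names, the weights, and the assumption that each criterion is scored in [0, 1] are hypothetical; the specification only states that a weighted average of the criteria yields a single number.

```python
def shot_quality(scores, weights):
    """Weighted average of per-criterion scores for one frame."""
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight

def best_shot(frames, weights):
    """Return the index of the frame with the highest quality estimate."""
    return max(range(len(frames)),
               key=lambda k: shot_quality(frames[k], weights))
```

For a human target the `scores` dictionary might hold entries such as skin-tone percentage, trajectory and blob size, while a vehicular target would use only blob size and trajectory.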

In an alternate embodiment of the invention, a best shot of the target 
is, once again, the goal. Again, the system of the first embodiment or 
the second embodiment may be employed. In this case, however, the slave's 
12 vision system 51 is provided with the ability to choose a best shot of 
the target. In the case of a human target, the slave 12 estimates shot 
quality based on skin-tone pixels in the head area, downward trajectory 
of the pan-tilt unit (indicating trajectory towards the camera), the size
of the blob (in the case of the second embodiment), and also stillness
of the PTZ head (the less the motion, the greater the clarity).



For vehicular targets, the slave estimates shot quality based on the size 
of the blob, upward pan-tilt trajectory, and stillness of the PTZ head. 
In this embodiment, the slave 12 sends back the results of the best shot, 
either a single image or a short video, to the master 11 for reporting
through the master's response engine 24. 

Master/Master Handoff 

In a further embodiment of the invention, multiple systems may be
interfaced with each other to provide broader spatial coverage and/or 
cooperative tracking of targets. In this embodiment, each system is 
considered to be a peer of each other system. As such, each unit includes 
a PTZ unit for positioning the sensing device. Such a system may operate, 
for example, as follows. 

Considering a system consisting of two PTZ systems (to be referred to as
"A" and "B"), initially, both would be master systems, waiting for an
offending target. Upon detection, the detecting unit (say, A) would then 
assume the role of a master unit and would order the other unit (B) to 
become a slave. When B loses sight of the target because of B's limited 
field of view/range of motion, B could order A to become a slave. At this
point, B gives A B's last known location of the target. Assuming A can 
obtain a better view of the target, A may carry on B's task and keep 
following the target. In this way, the duration of tracking can continue 
as long as the target is in view for either PTZ unit. All best shot 
functionality (i.e., as in the embodiments described above) may be
incorporated into both sensors.
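
The role exchange between the two peers can be pictured as a small state machine. The sketch below is one reading of the paragraph above, in which the unit that issues the order holds the master role; the class and method names (PTZUnit, on_detect, on_lost_sight, link) are hypothetical and do not come from the specification.

```python
class PTZUnit:
    """Hypothetical peer unit; both peers start out as masters."""

    def __init__(self, name):
        self.name = name
        self.role = "master"
        self.peer = None
        self.last_known = None   # last known target location, if any

    def on_detect(self, location):
        # The detecting unit assumes the master role and orders its peer
        # to become a slave that follows the target.
        self.role = "master"
        self.last_known = location
        self.peer._become_slave(location)

    def on_lost_sight(self):
        # The unit that lost the target orders its peer to become a slave
        # and hands over its own last known target location.
        self.role = "master"
        self.peer._become_slave(self.last_known)

    def _become_slave(self, location):
        self.role = "slave"
        self.last_known = location


def link(a, b):
    """Make two units peers of each other."""
    a.peer, b.peer = b, a
```

For example, after link(a, b) and a.on_detect((3, 4)), unit B is a slave holding (3, 4); if B later updates its last_known and calls b.on_lost_sight(), unit A becomes the slave and receives B's last known location.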

The invention has been described in detail with respect to preferred 
embodiments, and it will now be apparent from the foregoing to those 
skilled in the art that changes and modifications may be made without 
departing from the invention in its broader aspects. The invention, 
therefore, as defined in the appended claims, is intended to cover all 
such changes and modifications as fall within the true spirit of the 
invention. 

CLAIMS 

...used technique uses a color histogram. Color histograms
have been widely used in image and video indexing and retrieval, see
Smith et al. in "Automated Image Retrieval Using Color and Texture," IEEE
Transaction on Pattern Analysis ...four bins for each channel, a total of
64 (4x4x4) bins are needed for the color histogram.
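
As an illustration of the 64-bin count, a histogram with four bins per channel can be built as sketched below. The sketch assumes 8-bit RGB pixels; the excerpt does not state the color space or bit depth, and the function name is hypothetical.

```python
def color_histogram_64(pixels):
    """Count pixels into 4 x 4 x 4 = 64 bins (four bins per 8-bit channel).

    pixels is an iterable of (r, g, b) tuples with values 0..255 (assumed).
    """
    hist = [0] * 64
    for r, g, b in pixels:
        # Shifting right by 6 maps 0..255 onto the four bins 0..3.
        index = (r >> 6) * 16 + (g >> 6) * 4 + (b >> 6)
        hist[index] += 1
    return hist
```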

Motion Features 

Motion information can be extracted and measured from motion vectors in P
and...

...methods for extracting motion vectors are described, see Tan et al.,
"A new method for camera motion parameter estimation," Proc. IEEE
International Conference on Image Processing, Vol. 2, pp. 722-726, 1995;
Tan et al., "Rapid estimation of camera motion from compressed video with
application to video annotation," to appear in IEEE Trans. on Circuits
and Systems for Video Technology, 1999; Kobla et al., "Detection of
slow-motion replay sequences for identifying sports videos," Proc. IEEE
Workshop on Multimedia Signal Processing, 1999; Kobla et al., "Special
effect edit detection using VideoTrails: a comparison with existing
techniques," Proc. SPIE Conference on Storage and Retrieval for Image and
Video Databases VII, 1999; Kobla et al., "Compressed domain video indexing
techniques using DCT and motion vector information in MPEG video," Proc.
SPIE Conference on Storage and

...the key involved, a non-linear key could also be used.

In compression encoders having motion estimators which remain
effective in the presence of luminance fades, such as those utilising
phase correlation... to employ single B pictures between reference
pictures.

The video analysis processor can generate a flash effect flag by
looking at histogrammed luminance intensities across a number of
pictures to identify sudden luminance changes. The encoder may make use
of this flag to ensure that a picture which suffers from a flash effect
is not used as a reference picture. In other words, the encoder may react
to a flash effect flag by forcing the coding of a B picture. In this way,
the encoder can avoid or reduce the degradation of the encoded sequence
that would normally accompany a photographic lighting or other flash
effect.
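
One way such a flag might be derived is sketched below. The excerpt does not give the actual test, so the histogram distance, the threshold, the three-picture window, and all function names are assumptions made purely for illustration.

```python
def luminance_histogram(picture, bins=16):
    """Normalized histogram of 8-bit luminance samples (flat list, assumed)."""
    hist = [0] * bins
    for y in picture:
        hist[y * bins // 256] += 1
    n = len(picture)
    return [h / n for h in hist]


def hist_distance(h1, h2):
    """L1 distance between two normalized histograms (ranges from 0 to 2)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def flash_flags(pictures, threshold=0.5):
    """Flag a picture whose histogram jumps away from both neighbours
    while the neighbours themselves stay similar (a brief flash)."""
    hists = [luminance_histogram(p) for p in pictures]
    flags = [False] * len(pictures)
    for i in range(1, len(pictures) - 1):
        jump_in = hist_distance(hists[i - 1], hists[i])
        jump_out = hist_distance(hists[i], hists[i + 1])
        around = hist_distance(hists[i - 1], hists[i + 1])
        flags[i] = jump_in > threshold and jump_out > threshold and around < threshold
    return flags


def picture_types(flags):
    """Force a B picture wherever the flash flag is set, so that the
    flagged picture is never used as a reference."""
    return ["B" if f else "reference-eligible" for f in flags]
```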
Claims

Fulltext Word Count: 3208 
English Abstract 

The present invention relates to a method of processing an input digital 
video signal (IS) so as to provide a modified digital video signal (MS) 
for a motion estimation step (ME). Said processing method comprises the
steps of computing (HIS) a histogram (h) of original values associated 
with pixels belonging to a video frame contained in said input digital 
video signal, analyzing (ANA) the histogram to provide histogram 
parameters (hp), and correcting (COR) the original pixel values on
the basis of the histogram parameters to provide modified pixel values,
which yields the modified digital video signal to be used by the
motion estimation step. If required, this processing method may also 
comprise a step of filtering (FIL) the modified digital video signal so 
as to provide a filtered modified digital video signal (FMS) for the 
motion estimation step. Such a processing method is adaptive to the 
content of the input digital video signal and allows the motion 
estimation step to provide better motion vectors for the purpose of 
encoding. Use: video encoder 
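
The three steps (HIS, ANA, COR) can be sketched as follows. The abstract does not say which histogram parameters are extracted or how the correction is applied; the sketch assumes 8-bit luminance values, percentile-style low and high bounds as the parameters, and a linear contrast stretch as the correction. All function names are hypothetical.

```python
def compute_histogram(frame):
    """HIS: histogram h of the original 8-bit pixel values (flat list, assumed)."""
    hist = [0] * 256
    for v in frame:
        hist[v] += 1
    return hist


def analyze_histogram(hist, clip=0.01):
    """ANA: derive histogram parameters hp, here the low and high values
    left after ignoring a small fraction of pixels at each end (assumption)."""
    cutoff = sum(hist) * clip
    acc = 0
    for lo in range(256):
        acc += hist[lo]
        if acc > cutoff:
            break
    acc = 0
    for hi in range(255, -1, -1):
        acc += hist[hi]
        if acc > cutoff:
            break
    return lo, hi


def correct_pixels(frame, params):
    """COR: stretch the original values so that [lo, hi] spans 0..255."""
    lo, hi = params
    if hi <= lo:
        return list(frame)   # flat or empty frame: nothing to stretch
    scale = 255 / (hi - lo)
    return [min(255, max(0, round((v - lo) * scale))) for v in frame]
```

The corrected frame would then be passed to the motion estimation step in place of the original frame.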

French Abstract 

La presente invention concerne un procede de traitement d'un signal video
numerique d'entree (IS) permettant d'obtenir un signal video numerique
modifie (MS) pour une etape d'estimation de mouvement (ME). Ce procede de
traitement consiste a calculer (HIS) un histogramme (h) des valeurs de
depart associees a chaque pixel appartenant a une image video contenue
dans ledit signal video numerique d'entree, a analyser (ANA)
l'histogramme afin d'obtenir des parametres d'histogramme (hp), puis a
corriger (COR) les valeurs de pixel de depart sur la base des parametres
de l'histogramme afin d'obtenir des valeurs de pixel modifiees, ce qui
permet de produire un signal video numerique modifie destine a etre
utilise lors de l'etape d'estimation de mouvement. S'il y a lieu, ce
procede de traitement peut egalement comprendre une etape de filtrage
(FIL) du signal video numerique modifie de facon a obtenir un signal
video numerique modifie filtre (FMS) pour l'etape d'estimation de
mouvement. Ce procede de traitement peut etre adapte au contenu du signal
video numerique d'entree et permet, lors de l'etape d'estimation de
mouvement, d'obtenir de meilleurs vecteurs de mouvement destines au
codage. La presente invention s'applique au codage video.

Legal Status (Type, Date, Text) 

Publication 20020124 A1 With international search report.

Publication 20020124 A1 Before the expiration of the time limit for
amending the claims and to be republished in the
event of the receipt of amendments.

...the key involved, a non-linear key could also be used.

In compression encoders having motion estimators which remain
effective in the presence of luminance fades, such as those utilising
phase correlation... to employ single B pictures between reference
pictures.

The video analysis processor can generate a flash effect flag by
looking at histogrammed luminance intensities across a number of
pictures to identify sudden luminance changes. The encoder may make use
of this flag to ensure that a picture which suffers from a flash effect
is not used as a reference picture. In other words, the encoder may react
to a flash effect flag by forcing the coding of a B picture. In this way,
the encoder can avoid or reduce the degradation of the encoded sequence
that would normally accompany a photographic lighting or other flash
effect.

S1     7162  (HISTOGRAM??? OR GRAPH??? OR CHART?? OR PLOT???)(5N)
             (LUMINANCE??? OR CHROMINANC??? OR BRIGHT????? OR COLOR????
             OR COLOUR????)

S2     1485  (COMPUT??? OR ESTIMAT???? OR MEASUR???? OR CALCULAT???? OR
             DETERMIN???? OR EVALUAT???? OR RECORD???? OR DETECT????? OR
             EVALUAT????? OR IDENTIF?????)(5N)S1

S3   286043  (CORRECT???? OR ADJUST???? OR ALTER???? OR CHANG???? OR
             IMPROV???? OR AMEND???? OR MODIF???????)(5N)(VALU??? OR
             BRIGHT????? OR COLOR???? OR COLOUR???? OR LUMINANC??? OR
             CHROMINANC???)

S4   495697  VIDEO??

S5    75153  (NEW?? OR MODIFIED?? OR CORRECTED?? OR ADJUSTED?? OR ALTERED
             OR CHANGED OR IMPROVED OR AMENDED)(3N)(VALUE??? OR
             BRIGHT????? OR COLOR???? OR COLOUR???? OR LUMINANCE??? OR
             CHROMINANC???)

S6     2280  MOTION??(3N)ESTIMAT????

S7   172443  PIXEL??? OR PICTURE? ?(2N)ELEMENT? ? OR SUBPIXEL??

S8      531  AU=(MARTIN F? OR MARTIN, F?)

S9        0  S2 AND S3 AND S4 AND S5 AND S6 AND S7

S10       0  S2 AND S3 AND S4 AND S5 AND S6

S11       0  S2 AND S4 AND S5 AND S6

S12       0  S1 AND S3 AND S4 AND S6

S13       1  S1 AND S4 AND S6

S14       0  S8 AND S1

Abstract

We present an implementation of a system for content-based
search and retrieval of video based on low-level
visual features. Currently the system consists of three
parts: automatic video partition, feature extraction, and video
search and retrieval. Three primary features, color,
texture and motion, are used for indexing. They are
represented by color histogram, Gabor texture features,
and motion histogram. Most of the processing is done
directly in the MPEG compressed domain. Testing on
sports and movie databases has shown good retrieval
performance.